THE ADAPTIVE METHOD FOR OPTIMIZATION OF RULE-BASED KNOWLEDGE BASES IN REAL-TIME INTELLIGENT AGENTS
V. V. Denisov, M. G. Panteleev
St. Petersburg Electrotechnical University “LETI”
Àííîòàöèÿ – Ðàññìàòðèâàþòñÿ ìåòîäû îïòèìèçàöèè âíóòðåííåãî ïðåäñòàâëåíèÿ ïðîäóêöèîííûõ áàç çíàíèé â ôîðìå äåðåâüåâ ðåøåíèé äëÿ èíòåëëåêòóàëüíûõ àãåíòîâ ðåàëüíîãî âðåìåíè, îòëè÷àþùèåñÿ îò èçâåñ òíûõ ó÷åòîì ðàçëè÷íîé çíà÷íîñòè ïåðåìåííûõ. Çàäà÷à ñèíòåçà îïòèìàëüíûõ ïî ñëîæíîñòè äåðåâüåâ ðåøåíèé â äàííîé ïîñòàíîâêå ÿâëÿåòñÿ NP - ïîëíîé. Ïðåäëîæåí àäàïòèâíûé ìåòîä ñèíòåçà, ðåàëèçóþùèé êîìïðîìèññ ìåæäó èìåþùèìñÿ âðåìåíåì è êà÷åñòâîì ðåøåíèÿ. Ïðèâåä åíû ðåçóëüòàòû ýêñïåðèìåíòàëüíîé îöåíêè ýôôåêòèâíîñòè àëãîðèòìà.
Introduction. The intelligent agents creation is one of the major directions in artificial intelligence. Intelligent agents are the high-level and highly integrated knowledge based systems, possessing the characteristics of autonomy and goal-directing [1]. The rule-based (production) model of knowledge representation is widely used for constructing of intelligent agents. For example, in SOAR system [2] both domain and control knowledge are represented as production rules. T he basic requirement to design of intelligent agents in wide range of applications (e.g. embedded real-time systems) is the effective use of resources. Different production systems (PS) representation forms are known: decision tables, decision trees (DT) , RETE and METE graphs, etc [3]. However, exactly DT have knowledge base presentation compactness and inference engine simplicity needed for the embedded applications [4]. Thus, the actual problem is the task of knowledge bases optimization in DT form. D ifferent simple and fast methods for DT construction (for example “sum of dashes”) are known [4]. These methods have low computational complexity, but do not guarantee optimal solutions. Moreover, they do not consider the factor of the different number o f values of the variables. The proposed approach takes into account this feature.
Optimal method for DT building. The general algorithm of DT construction is the depth DT traversal with choice of the drawing-forward variable on each step. The essence of the issue is to define the criterion for the drawing-for ward variable choice. That is the exact distinguishing feature of DT synthesis methods. The complexity optimization becomes feasible due to the following factors:
the presence of insufficient variables on some sets of rules;
check order in case of different number of values of variables;
Other matters being equal, the minimum complexity tree is the one, where variables are checked in the number of values ascending order, though in a number of cases the shift of insufficient variables to the lower level gives more benef icial configuration to the resulting graph.
Lets consider the path in the already designed graph fragment, to which the combination w of variable values corresponds. Particularly, on the first step of graph synthesis, w is the empty combination: w=e. M(w) is the set of variables, with their values presented in w conjunction, O M(w) is the set of variables, with their values not presented in w, Wi(w) is the set of situations, covered by w conjunction, for those variable Xi&# 206; O M(w) is insignificant. It’s obvious, that the drawing-forward of w path for such variable Xi with the minimum number of values, where does not worsen the complexity of the resulting graph. We shall call by “conflict situation” (CS) a situation, without such variable. When XiI M(w), Xi[w] will be the value of variable Xi in combination w. We shall consider two combinations wr and ws, where M(wr)I M(ws). We can say, that wr covers ws, given that:
We shall call the situation with key ws subordinate to the situation with key wr, if wr covers ws. We shall name the situation with key ws directly subordinate to the situation with key wr, provided that wr covers ws, and there is no such a situation with key wt, when wr covers wt and wt covers ws. Let us call the w situation solution cost the variable Xi and mark C(w,Xi) the complexity of the optimal subgraph, obtained from the w situation solution by means of variable Xi:
where Ii - the number of values of Xi variable, wij is the combination w, drawn forward by j value of Xi variable . Value C(w,Xi) is a goal function of the optimal drawing-forward variable choice for the w path. The minimisation of C(w,Xi) value means the minimisation of the DT complexity.
The set of situations with given relation of the direct subordination can be suitably presented in Hasse Diagram, and name i t in terms of the analysed interpretation - the graph of situations’ subordination (GSS). For the convenience of the defining of the optimal resolution variable, we shall intoduce the second type nodes to GSS, corresponding to variables-nodes (V-nodes). From each situation-node (S-node), go out the arcs, according to the number of conflicting variables, if we are dealing with the conflict situation, or only one arc in the opposite case, locked onto V-nodes. From V-nodes go out the arcs, locked by the S- nodes, directly subordinate to the situation-predecessor of the analysed V-node. We shall call such a graph the expanded graph of situations’ subordination (EGSS) (Fig.1). Defining the optimal resolution variables is conducted by means of EGSS ascent, fr om final nodes to the root. Each V-node increases the cost value of covering situation by one. The optimal resolution variables are defined with variable C(w,Xi) minimum. Thus, the procedure of the second item cal culation in expression (1) is recursive, whereas the subordinate situations are revealed by means of going through different drawing-forward combinations. The complexity of going through grows exponentially with the growth of the number of the variables and their values. In Table 1 we can see the number of steps for the optimal tree construction algorithm in the worst case for decision tables, where the number of values for all variables equals 4.Table 1.Complexity growth of the optimal decision
The number of variables |
The number of EGSS synthesis algorithm steps |
3 |
493 |
4 |
7889 |
5 |
157787 |
6 |
3786745 |
7 |
106028861 |
8 |
3392923553 |
Fig.1. The fragment of EGSS.
the variables — tree synthesis algorithm. Step execution time for PC Pentium200 » 2ms.Reduction of calculations becomes possible due to the directed search out, using “branch and bound method”. The essence of the method is the following: if in the analysis prosess of some CS resolution variant, its cost didn’t be come less, then the cost of the best variant from the already considered, analysis stops immediately and the variant is discarded. Initially, for each variant of the CS resolution, the probability of that it turns out optimal (accordingly with the least cost value) is determined. The variants are analysed in the estimation value descending order. As the result of the experiments, the directed search cuts down about 25-30% of synthesis time. The benefit is strongly obvious in case of variables’ different number of values. EGSS-based tree design method guarantees optimal results, but already when we have seven 4-values variables, synthesis takes more than a day. It makes the use of this method practically impossible.
The adaptive DT synthesis method. This method regulates the degree of EGSS-based algorithm use. The main idea of the adaptive algorithm is the tree constructing by means of some fast (for example, “sum of dashes”) method in comb ination with EGSS-based synthesis method. To determine, which subgraphs should be built upon EGSS, the algorithm considers the time of decision making and performance of using computing platform.
The computational complexity of the “small” (2-5 variables) trees synthesis based on EGSS, can be compared to the fast method complexity, whereas the benefit is sufficient — e.g. two nodes (22%) for 3 inputs variables. That is obvious, that DT with large number of variables contains sufficient amount of similar subgraphs. The complexity of full tree includes the sum of these subgraphs complexities. While subgraph constructing upon EGSS, we decrease general complexity of the resulting tree with insufficient grows of computing expenses.
The number of steps, needed for the graph construction grows rapidly with the growth of number of variables. For example, while making 8000 steps, we can design 16 3-level subgraphs or one 4-level subgraph (Table 1). That is clear, tha t it is more beneficial to use the time for the synthesis of the large number of small subgraphs, than the small number of large subgraphs. The algorithm fo implementation of proposed method is presented below.
Method Build(n, Table)
Input data: N
v – the number of input variables of the initial DT; l – the number of top level, from which synthesis EGSS based method starts functioning, the levels are numbe red from 0 to (Nv–1)in depth from the root; Mk – the number of not yet built subgraphs of k level; Sk — th e number of steps in the worst case, necessary for the subgraph of the k-level design by means of EGSS synthesis method; t s — time of one step of EGSS synthesis algorithm completion; t — period of time not yet used at the given moment; Table – the current DT; n – the current level, where synthesis goes.The result: DT designed in procedures BuildEGSSBasedMethod () and AddToTree ().
1 {If (conflict situation)
2 {If ( n ? l ) /* current level analysis */
3 If(S
l ? t/t s ) /* analysis of the current time */4 { BuildEGSSBasedMethod( );
5 M
n:= Mn -1; /* correction of Mn */6 If( l>0 & ] M
l /Imin[Sl-1? t/t s )7 l:=l-1;
8 return;
9 }
10 Else If ( l< N
v -2 ) l:=l+111 Else l:= N
v;12 }
13 X
j=GiveUpdatingVariable ( );14 AddToTree(X
j);15 M
n:= Mn -1; /* correction of Mn */16 For( every value of X
j ) 17 Build( n+1,NewDecisionTable(Table, Xj );18 return; }
After the update of the number of subgraphs, left on level n (s.5), the correction of the threshold may occur. The complexity and, consequently, the time are always computed for the worst case. It is clear, that the real time is smaller, because in the number of cases the choice of the building variable is obvious. Then the calculation amount decreases due to directed search by means of “branch and bound method”. Thus after each call of EGSS synthesis procedure, there is some time left, that can be used for lager complexity subgraphs’ synthesis. The number of steps, needed for subgraphs constructing depends exponentially from level l. It is necessary to increase the threshold without decreasing the operati ng effectiveness. It is obvious, that each subgraph on k level includes Ir subgraphs of (k+1) level; Ir — the number of arcs going out of its root. Therefore, going through k leve l is not less effective in terms of complexity reduction, than searching Ir of its subgraphs of (k+1) level. Thus we get the condition of decrease threshold l (s.6). If there is no time for going through even one l-level subgraph, then threshold grows (ss.10-11). In cases when the situation is not conflict or EGSS synthesis method cannot be applied, the variable is selected by means of fast method.
Experimental results. The method was program implemented in Ñ++, with simulation (during the 100 hours) on PC Pentium200. As the results of the simulation decision trees were designed for 800 decisio n tables of various classes with the number of input variables from 3 to 10. The experiment has shown that the adaptive method gives up to 20% of the resulting tree complexity reduction in compare to know fast methods (such as “sum of dashes”). Co mplexity advantages rise along the grows of the number of variables (Fig.2).
Conclusion. The adaptive method of DT construction provides the following advantages:
Fig.2. Complexity advantages rise along with the grows of the number of variables.
References
1. Wooldridge M. and Jennings N.R. Intelligent agents: Theory and practice. The Knowledge Engineering Review, 10(2),115–152, 1995.
2. Rosenbloom P.S., Laird J.E., Newell A., McCarl R. A preliminary analysis of the Soar architecture as a basis for general intelligence. Artificial Intelligence, 1991, v.47, p.289-325.
3. Popov E.V. Expert systems. -
Ìoscow, Nauka, 1987. - 288p.4. Humby E., Programs from decision tables. -Computer monographs, 1973.— 84p.
Site of Information
Technologies Designed by inftech@webservis.ru. |
|